BASIC PYTHON FOR RESEARCHERS

by Megat Harun Al Rashid bin Megat Ahmad
last updated: April 14, 2016


4. The Sequence

A sequence is a type of data structure, similar to an array, in which each element can be accessed by its index. Python provides several built-in sequence and collection types:

  1. Strings
  2. Lists
  3. Tuples
  4. Dictionaries
  5. Sets
  6. Frozen sets

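The most commonly used are lists, tuples and dictionaries, which we will explore here. Strictly speaking, dictionaries and sets are accessed by key or by membership rather than by positional index.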


4.1 The list sequence

A $list$ can be constructed using square brackets [], with the elements of the $list$ separated by commas.


In [1]:
List_num = [1,2,3,4,5]
print List_num
E = len(List_num)
print "There are %d elements in the list %s" % (E,List_num)


[1, 2, 3, 4, 5]
There are 5 elements in the list [1, 2, 3, 4, 5]

The elements of a $list$ can be numbers, strings, a mixture of both, or other types of sequences. An element can be accessed by specifying its position in the $list$, counting from 0 (similar to accessing the characters of a string, as discussed in Tutorial 2). The number of elements in a $list$ can be obtained using the $len$() function.


In [2]:
List_str = ["Blythe","Rafa","Felicity","Kiyoko"]

print "What is the word happy in Arabic?"
print "The answer is %s and it starts with the capital letter %s." % \
(List_str[1],List_str[1][0])


What is the word happy in Arabic?
The answer is Rafa and it starts with the capital letter R.

In the above example, an element of the $List$_$str$ $list$ is accessed by specifying its positional index in square brackets after the variable name of the $list$. A character of that string is then accessed with a second bracketed index, because a string is also a sequence.

An accessed element of a $list$ can be operated on just like a variable:


In [3]:
List_mix = ["The Time Machine", 1895, "The Invisible Man", 1897, \
            "The Shape of Things to Come", 1933]
print '"%s" was first published in %d with \n"%s" published %d years \
later.' % (List_mix[0],List_mix[1],List_mix[4],List_mix[5]-List_mix[1])


"The Time Machine" was first published in 1895 with 
"The Shape of Things to Come" published 38 years later.

Example 4.1: The following are some well-known implementations of the Python programming language: CPython, Cython, PyPy, IronPython, Jython and Unladen Swallow. Put this sequence in a list, then build a new list that contains only three implementations, arranged in your order of preference. Print the new list.


In [4]:
Python_Impl = ['CPython','Cython','PyPy','IronPython','Jython','Unladen Swallow']
New_Python_Impl = [Python_Impl[1],Python_Impl[0],Python_Impl[2]]

print New_Python_Impl


['Cython', 'CPython', 'PyPy']

Example 4.2: From this list: ['Python','Java','C','Perl','Sed','Awk','Lisp','Ruby'], recreate the original list of Python implementations.


In [5]:
PN = ['Python','Java','C','Perl','Sed','Awk','Lisp','Ruby']

Pyth_ImplN = [PN[2]+PN[0],PN[2]+PN[0][1:],PN[0][:2]*2,\
              PN[6][1].upper()+PN[3][2]+PN[0][4:]+PN[0],\
              PN[1][0]+PN[0][1:],PN[7][1].upper()+PN[0][-1]+PN[3][-1]+\
              PN[1][1]+PN[4][-1]+PN[4][-2]+PN[0][-1]+' '+\
              PN[-2][-2].upper()+PN[-3][1]+PN[1][1]+PN[3][3]*2+\
              PN[0][-2]+PN[-3][1]]

print Pyth_ImplN


['CPython', 'Cython', 'PyPy', 'IronPython', 'Jython', 'Unladen Swallow']
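
Example 4.2 relies on negative indices and slices, which work on lists in the same way as on strings. The following is a minimal sketch of those operations on their own; the variable names ($langs$, $last$, $tail$ and so on) are only illustrative and are not part of the tutorial.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
# Illustrative list (hypothetical name), same content as PN in Example 4.2
langs = ['Python', 'Java', 'C', 'Perl', 'Sed', 'Awk', 'Lisp', 'Ruby']

last = langs[-1]            # a negative index counts from the end
first_three = langs[:3]     # slice from the start up to, not including, index 3
tail = langs[0][1:]         # slice of a string element, from index 1 to the end
doubled = langs[0][:2]*2    # a slice repeated twice

print(last)
print(first_three)
print(tail)
print(doubled)
```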

So far we have seen one-dimensional $lists$, both homogeneous (all elements of the same type) and non-homogeneous (elements of mixed types):


In [6]:
x = [12,45,78,14,23]
y = ["Dickens","Hardy","Austen","Steinbeck"]
Z = [3E8,'light',"metre"]

A $list$ can also be multi-dimensional.


In [7]:
# Homogeneous multi dimensional list (2D):

# List_name[row][column]

x2 = [[12,32],[43,9]]
print x2
print x2[1]              # Second row
print x2[0][1]           # First row, second column


[[12, 32], [43, 9]]
[43, 9]
32

In a matrix representation, this is: $$\left( \begin{array}{cc} 12 & 32 \\ 43 & 9\end{array} \right)$$

and to get the matrix determinant:


In [8]:
# Matrix determinant

det_x2 = x2[0][0]*x2[1][1]-x2[0][1]*x2[1][0]
print "Determinant of x2 is %d" % det_x2


Determinant of x2 is -1268

A multi-dimensional $list$ is actually a $list$ whose elements are themselves $lists$:


In [9]:
x1 = [0.1,0.2,0.3,0.4,0.5]
x2 = [0,12,34,15,1]
x = [x1,x2]
print x              # A 2x5 Array


[[0.1, 0.2, 0.3, 0.4, 0.5], [0, 12, 34, 15, 1]]

A multi-dimensional $list$ can also be non-homogeneous:


In [10]:
Data_3D = [[[2,3,5],[1,7,0]],[5,"ArXiv"]]

#print number 7
print Data_3D[0][1][1]

print 'Mr. Perelman published the solution \
to the Poincar%s conjecture in "%s".' % (u"\u00E9", Data_3D[1][1])


7
Mr. Perelman published the solution to the Poincaré conjecture in "ArXiv".

Data_3D is actually a $list$ of $lists$ that in turn contain $lists$, but the nesting is non-homogeneous.

The elements in a $list$ can be substituted.


In [11]:
# Extracting and substitution

L1 = Data_3D[0]; print L1
L2 = [Data_3D[1]]+[Data_3D[0][0]]
print L2
print L2[0][1]
Data_3D[1][1] = "PlosOne"
print Data_3D


[[2, 3, 5], [1, 7, 0]]
[[5, 'ArXiv'], [2, 3, 5]]
ArXiv
[[[2, 3, 5], [1, 7, 0]], [5, 'PlosOne']]
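
Note that extracting a sub-list as in the cell above does not copy it: the new name refers to the same $list$ object, so printing L2 again after the substitution would also show 'PlosOne'. A small sketch of this behaviour follows, using illustrative names ($data$, $inner$, $copied$) that are not part of the tutorial.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
data = [[2, 3, 5], [5, "ArXiv"]]

inner = data[1]            # refers to the same inner list, not a copy
copied = list(data[1])     # list() builds a new, independent (shallow) copy

data[1][1] = "PlosOne"     # substitute an element through the outer list

print(inner)               # the change is visible through 'inner'
print(copied)              # the copy keeps the old value
```

When an independent copy is wanted, making it explicitly with $list$() (as assumed here) is one simple option.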

Iterating over the elements of a $list$ means accessing them one after another. This can be done using the $for$ and $while$ control structures, as well as the $enumerate$() function.


In [12]:
# Looping: for

dwarf = ["Eris","Pluto","Makemake","Haumea","Sedna"]
print dwarf


['Eris', 'Pluto', 'Makemake', 'Haumea', 'Sedna']

In [13]:
for name in dwarf:
    print name


Eris
Pluto
Makemake
Haumea
Sedna

In [14]:
for z in range(len(dwarf)):
    print "%d\t%s" % (z,dwarf[z])


0	Eris
1	Pluto
2	Makemake
3	Haumea
4	Sedna

In [15]:
for x,z in enumerate(dwarf,1):
    print "%d\t%s" % (x,z)


1	Eris
2	Pluto
3	Makemake
4	Haumea
5	Sedna

In [16]:
z = 0
while z < len(dwarf):
    print "%d\t%s" % (z+1,dwarf[z])
    z = z + 1


1	Eris
2	Pluto
3	Makemake
4	Haumea
5	Sedna

Example 4.3: Calculate and print each value of x*y with:

x = [12.1,7.3,6.2,9.9,0.5]

y = [4.5,6.1,3.9,1.7,8.0]



In [17]:
x = [12.1,7.3,6.2,9.9,0.5]
y = [4.5,6.1,3.9,1.7,8.0]

i = 0
xy = []      # Creating empty list
while i < (len(x)):
    xy = xy + [x[i]*y[i]]     # Appending result into list
    print '%.1f x %.1f = %.2f' % (x[i],y[i],xy[i])
    i = i + 1
print '\n'        
print xy


12.1 x 4.5 = 54.45
7.3 x 6.1 = 44.53
6.2 x 3.9 = 24.18
9.9 x 1.7 = 16.83
0.5 x 8.0 = 4.00


[54.449999999999996, 44.529999999999994, 24.18, 16.830000000000002, 4.0]
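
The same element-wise product can also be written without an explicit index. The sketch below is one possible alternative rather than the method used above: it assumes the built-in $zip$() function to pair the elements of the two lists and the $append$() method to grow the result.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
x = [12.1, 7.3, 6.2, 9.9, 0.5]
y = [4.5, 6.1, 3.9, 1.7, 8.0]

xy = []
for a, b in zip(x, y):      # zip pairs x[0] with y[0], x[1] with y[1], ...
    xy.append(a*b)          # append adds one element to the end of the list
    print('%.1f x %.1f = %.2f' % (a, b, a*b))
```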

Example 4.4: Calculate and print each value of x2*y2 with:

x2 = [[12.1,7.3],[6.2,9.9]]

y2 = [[4.5,6.1],[3.9,1.7]]



In [18]:
x2 = [[12.1,7.3],[6.2,9.9]]
y2 = [[4.5,6.1],[3.9,1.7]]

j = 0
xy2 = []
xy3 = []
while j < len(x2):
    for k in range(len(x2[j])):       # loop over the columns of row j
        xy3 = xy3 + [x2[j][k]*y2[j][k]]
        print '%.1f x %.1f = %.2f' % (x2[j][k],y2[j][k],xy3[k])
    xy2 = xy2 + [xy3]                 # store the finished row
    xy3 = []                          # start a new, empty row
    j = j + 1
print '\n'     
print xy2


12.1 x 4.5 = 54.45
7.3 x 6.1 = 44.53
6.2 x 3.9 = 24.18
9.9 x 1.7 = 16.83


[[54.449999999999996, 44.529999999999994], [24.18, 16.830000000000002]]

Example 4.5: Create a list that contains the $f(x)$ values of a Gaussian function with $\sigma$ = 0.4 and $\mu$ = 5, for $x$ from 3 to 7 in steps of 0.1.

The Gaussian function:$$f(x) = e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$


In [19]:
from math import exp

sigma = 0.4
mu = 5.0

x_val = []
ctr = 3
while ctr < 7:
    x_val = x_val + [ctr]
    ctr = ctr + 0.1

fx = []
for n in range(0,len(x_val),1):
    intensity = exp(-(x_val[n]-mu)**2/(2*sigma**2))
    fx = fx + [intensity]
    print '%f\t%s' % (intensity,int(intensity*50)*'*')

fx


0.000004	
0.000013	
0.000040	
0.000120	
0.000335	
0.000884	
0.002187	
0.005086	
0.011109	
0.022794	*
0.043937	**
0.079560	***
0.135335	******
0.216265	**********
0.324652	****************
0.457833	**********************
0.606531	******************************
0.754840	*************************************
0.882497	********************************************
0.969233	************************************************
1.000000	**************************************************
0.969233	************************************************
0.882497	********************************************
0.754840	*************************************
0.606531	******************************
0.457833	**********************
0.324652	****************
0.216265	**********
0.135335	******
0.079560	***
0.043937	**
0.022794	*
0.011109	
0.005086	
0.002187	
0.000884	
0.000335	
0.000120	
0.000040	
0.000013	
0.000004	
Out[19]:
[3.7266531720786777e-06,
 1.2607105177048545e-05,
 4.006529739295121e-05,
 0.00011961288358102479,
 0.00033546262790251364,
 0.0008838263069350546,
 0.0021874911181828985,
 0.005086069231012732,
 0.011108996538242375,
 0.022794180883612486,
 0.04393693362340769,
 0.07955950871822796,
 0.13533528323661287,
 0.2162651668298872,
 0.32465246735834913,
 0.45783336177161305,
 0.6065306597126316,
 0.7548396019890051,
 0.8824969025845932,
 0.9692332344763427,
 1.0,
 0.9692332344763459,
 0.8824969025845991,
 0.7548396019890127,
 0.6065306597126396,
 0.45783336177162065,
 0.3246524673583556,
 0.21626516682989225,
 0.13533528323661645,
 0.07955950871823037,
 0.043936933623409155,
 0.022794180883613388,
 0.0111089965382429,
 0.005086069231013008,
 0.002187491118183033,
 0.0008838263069351174,
 0.00033546262790254015,
 0.00011961288358103563,
 4.006529739295526e-05,
 1.2607105177049956e-05,
 3.7266531720791343e-06]
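
The small differences between the two halves of this output (for example 0.9692332344763427 and 0.9692332344763459) arise because 0.1 cannot be represented exactly in binary floating point, so adding it repeatedly accumulates a rounding error. One possible way to reduce this, sketched below with the same $\sigma$ and $\mu$, is to compute each $x$ from an integer counter instead of accumulating it. The name $npts$ and the use of $append$() are choices made for this sketch only.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
from math import exp

sigma = 0.4
mu = 5.0

npts = 41                       # number of points: 3.0, 3.1, ..., 7.0
x_val = []
fx = []
for i in range(npts):
    x = 3.0 + i/10.0            # computed from the integer i, no accumulation
    x_val.append(x)
    fx.append(exp(-(x-mu)**2/(2*sigma**2)))

print(fx[20])                   # value at the peak, x = mu
print(abs(fx[19] - fx[21]))     # difference between symmetric neighbours
```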

4.1.1 Converting data from a file into a list

Each line in a file can be read directly into a list using the $readlines$() method. For instance, for the file used in section 2.4 of Tutorial 2, instead of the $read$() function we can use $readlines$() to store each line of the file $les miserables.txt$ as an element of the list $linecontent$:


In [20]:
# Opening a file 
file_read = open("Tutorial2/les miserables.txt")
linecontent = file_read.readlines()
file_read.close()

The elements of $linecontent$ are now the lines of $les miserables.txt$ (including the newline escape character '\n'):


In [21]:
linecontent


Out[21]:
['Preface from Les Miserables\n',
 '\n',
 'So long as there shall exist, by reason of law and custom,\n',
 'a social condemnation, which, in the face of civilisation,\n',
 'artificially creates hells on earth, and complicates a\n',
 'destiny that is divine, with human fatality; so long as the\n',
 'three problems of the age - the degradation of man by poverty,\n',
 'the ruin of woman by starvation, and the dwarfing of childhood\n',
 'by physical and spiritual night - are not solved; so long as,\n',
 'in certain regions, social asphyxia shall be possible; in other\n',
 'words, and from a yet more extended point of view, so long as\n',
 'ignorance and misery remain on earth, books like this cannot\n',
 'be useless.\n',
 '\n',
 'HAUTEVILLE HOUSE, 1862.\n',
 'FANTINE']
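
If the trailing newline characters are not wanted, they can be removed with the string method $rstrip$(). The sketch below uses a short hand-written list, $sample$, as a stand-in for $linecontent$ so that it runs without the file; both $sample$ and $clean$ are illustrative names.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
# Stand-in for linecontent (three of the lines shown in the output above)
sample = ['Preface from Les Miserables\n', '\n', 'FANTINE']

clean = []
for line in sample:
    clean.append(line.rstrip('\n'))   # remove the newline at the end, if any

print(clean)
```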

4.2 The Tuples

A $tuple$ can be declared using round brackets (). A $tuple$ behaves like a $list$ whose elements cannot be modified or substituted. Apart from that, it has properties similar to those of a $list$.


In [22]:
t1 = (1,2,3,4)
t1


Out[22]:
(1, 2, 3, 4)

Attempting to substitute a $tuple$ element will give an error.


In [23]:
t1[1] = 5


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-d1336156237f> in <module>()
----> 1 t1[1] = 5

TypeError: 'tuple' object does not support item assignment
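
Reading from a $tuple$ works just as it does for a $list$, and a modified version can be obtained by converting the $tuple$ to a $list$ and back. A brief sketch, in which the names $tmp$ and $t2$ are illustrative only:

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
t1 = (1, 2, 3, 4)

print(t1[1])        # indexing works as for a list
print(len(t1))      # so does len()

tmp = list(t1)      # convert to a list, which can be modified
tmp[1] = 5
t2 = tuple(tmp)     # convert back to a new tuple; t1 itself is unchanged

print(t1)
print(t2)
```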

4.3 The Dictionaries

$Dictionaries$ are similar to the "associative arrays" found in many other programming languages. $Dictionaries$ are indexed by keys, which can be strings or numbers (more generally, any immutable type). Data in a $dictionary$ are accessed by specifying the key instead of an index number. The data can be anything, including other $dictionaries$. A $dictionary$ is declared using curly brackets {}, with each key and its data separated by '$:$' and each key-data pair separated by '$,$'.


In [24]:
# Nearby stars to the earth

Stars = {1:'Sun', 2:'Alpha Centauri', 3:"Barnard's Star",\
         4:'Luhman 16', 5:'WISE 0855-0714'}

Stars[3]      # Specify the key instead of index number


Out[24]:
"Barnard's Star"

In the above example the keys are integers whereas the data are all strings. The opposite is also possible:


In [25]:
# Distance of nearby stars to the earth

Stars_Dist = {'Sun':0, 'Alpha Centauri':4.24, "Barnard's Star":6.00,\
         'Luhman 16':6.60, 'WISE 0855-0714':7.0}

print 'Alpha Centauri is %.2f light years from earth.' % \
(Stars_Dist['Alpha Centauri'])


Alpha Centauri is 4.24 light years from earth.

This information can be made more structured by using a $list$ as the data in a $dictionary$.


In [26]:
# A more structured dictionaries data

Stars_List = {1:['Sun',0], 2:['Alpha Centauri',4.24],\
              3:["Barnard's Star",6.00], 4:['Luhman 16',6.60],\
              5:['WISE 0855-0714',7.0]}

print '%s is the fourth closest star at about %.2f light \
\nyears from earth.' % (Stars_List[4][0],Stars_List[4][1])


Luhman 16 is the fourth closest star at about 6.60 light 
years from earth.
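
A $dictionary$ can be changed after it is created, and its keys can be looped over. The following sketch shows a few common operations on a shortened copy of the distance data above; the name $dist$ and the test key 'Sirius' are illustrative.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
dist = {'Alpha Centauri': 4.24, "Barnard's Star": 6.00}

dist['Sun'] = 0                      # assigning to a new key adds a pair
print(len(dist))                     # number of key/data pairs

print('Sun' in dist)                 # membership test on the keys
print('Sirius' in dist)              # a key that was never added

for name in sorted(dist):            # loop over the keys in sorted order
    print('%s\t%.2f' % (name, dist[name]))
```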

Below is an example of a $dictionary$ that contains other $dictionaries$ as data, together with the ways to access them.


In [27]:
# Declaring dictionaries data for the dictionary 'Author'

Coetzee = {1974:'Dusklands',
           1977:'In The Heart Of The Country',
           1980:'Waiting For The Barbarians',
           1983:'Life & Times Of Michael K'}

McCarthy = {1992:'All the Pretty Horses',
            1994:'The Crossing',
            1998:'Cities of the Plain',
            2005:'No Country for Old Men',
            2006:'The Road'}

Steinbeck = {1937:'Of Mice And Men',
             1939:'The Grapes Of Wrath',
             1945:'Cannery Row',
             1952:'East Of Eden',
             1961:'The Winter Of Our Discontent'}

Lewis = {'Narnia Series':{1950:'The Lion, the Witch and the Wardrobe',
         1951:'Prince Caspian: The Return to Narnia',
         1952:'The Voyage of the Dawn Treader',
         1953:'The Silver Chair',
         1954:'The Horse and His Boy',
         1955:"The Magician's Nephew",
         1956:'The Last Battle'
         }}

# Assigning keys and data for the dictionary 'Author';
# one of the data items is a list of dictionaries

Author = {'South Africa':Coetzee,'USA':[McCarthy,Steinbeck],
          'British':Lewis}

In [28]:
Author['South Africa'][1983]


Out[28]:
'Life & Times Of Michael K'

In [29]:
Author['USA'][1][1939]


Out[29]:
'The Grapes Of Wrath'

In [30]:
Author['British']['Narnia Series'][1953]


Out[30]:
'The Silver Chair'
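
Nested dictionaries can be traversed with the same looping constructs used for lists. The sketch below repeats a small part of the data so that it runs on its own, and uses the dictionary method $get$() to look up a key that may be missing; $books$ is an illustrative name.

```python
# Written with print(...) so that it runs under both Python 2 and Python 3
Coetzee = {1974: 'Dusklands',
           1983: 'Life & Times Of Michael K'}
books = {'South Africa': Coetzee}

# Loop over the inner dictionary in order of year
for year in sorted(books['South Africa']):
    print('%d\t%s' % (year, books['South Africa'][year]))

# get() returns None (or a chosen default) instead of raising KeyError
print(books['South Africa'].get(1999))
print(books['South Africa'].get(1999, 'not listed'))
```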

More on lists and dictionaries can be found at https://docs.python.org/2/tutorial/datastructures.html